Uncertain Centroid based Partitional Clustering of Uncertain Data

نویسندگان

  • Francesco Gullo
  • Andrea Tagarelli
چکیده

Clustering uncertain data has emerged as a challenging task in uncertain data management and mining. Thanks to a computational complexity advantage over other clustering paradigms, partitional clustering has been particularly studied and a number of algorithms have been developed. While existing proposals differ mainly in the notions of cluster centroid and clustering objective function, little attention has been given to an analysis of their characteristics and limits. In this work, we theoretically investigate major existing methods of partitional clustering, and alternatively propose a well-founded approach to clustering uncertain data based on a novel notion of cluster centroid. A cluster centroid is seen as an uncertain object defined in terms of a random variable whose realizations are derived based on all deterministic representations of the objects to be clustered. As demonstrated theoretically and experimentally, this allows for better representing a cluster of uncertain objects, thus supporting a consistently improved clustering performance while maintaining comparable efficiency with existing partitional clustering algorithms.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Clustering of Fuzzy Data Sets Based on Particle Swarm Optimization With Fuzzy Cluster Centers

In current study, a particle swarm clustering method is suggested for clustering triangular fuzzy data. This clustering method can find fuzzy cluster centers in the proposed method, where fuzzy cluster centers contain more points from the corresponding cluster, the higher clustering accuracy. Also, triangular fuzzy numbers are utilized to demonstrate uncertain data. To compare triangular fuzzy ...

متن کامل

MMVar: Clustering Uncertain Objects via Minimization of the Variance of Cluster Mixture Models

A major issue in clustering uncertain objects is related to the poor efficiency of existing algorithms, which is mainly due to expensive computation of the distance between uncertain objects. This paper discusses how we addressed this issue through an original formulation of the problem of clustering uncertain objects based on the minimization of the variance of the mixture models that represen...

متن کامل

CUSTOMER CLUSTERING BASED ON FACTORS OF CUSTOMER LIFETIME VALUE WITH DATA MINING TECHNIQUE

Organizations have used Customer Lifetime Value (CLV) as an appropriate pattern to classify their customers. Data mining techniques have enabled organizations to analyze their customers’ behaviors more quantitatively. This research has been carried out to cluster customers based on factors of CLV model including length, recency, frequency, and monetary (LRFM) through data mining. Based on LRFM,...

متن کامل

Uncertain Data Clustering in Distributed Peer-to-Peer Networks.

Uncertain data clustering has been recognized as an essential task in the research of data mining. Many centralized clustering algorithms are extended by defining new distance or similarity measurements to tackle this issue. With the fast development of network applications, these centralized methods show their limitations in conducting data clustering in a large dynamic distributed peer-to-pee...

متن کامل

Density-Based Clustering Based on Probability Distribution for Uncertain Data

Today we have seen so much digital uncertain data produced. Handling of this uncertain data is very difficult. Commonly, the distance between these uncertain object descriptions are expressed by one numerical distance value. Clustering on uncertain data is one of the essential and challenging tasks in mining uncertain data. The previous methods extend partitioning clustering methods like k-mean...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • PVLDB

دوره 5  شماره 

صفحات  -

تاریخ انتشار 2012